AITopics | weight sequence

Less is More: Efficient Weight Farcasting with 1-Layer Neural Network

Shou, Xiao, Bhattacharjya, Debarun, Ding, Yanna, Zhao, Chen, Li, Rui, Gao, Jianxi

arXiv.org Artificial Intelligence, May-6-2025

Addressing the computational challenges inherent in training large-scale deep neural networks remains a critical endeavor in contemporary machine learning research. While previous efforts have focused on enhancing training efficiency through techniques such as gradient descent with momentum, learning rate scheduling, and weight regularization, the demand for further innovation continues to burgeon as model sizes keep expanding. In this study, we introduce a novel framework which diverges from conventional approaches by leveraging long-term time series forecasting techniques. Our method capitalizes solely on initial and final weight values, offering a streamlined alternative for complex model architectures. We also introduce a novel regularizer that is tailored to enhance the forecasting performance of our approach. Empirical evaluations conducted on synthetic weight sequences and real-world deep learning architectures, including the prominent large language model DistilBERT, demonstrate the superiority of our method in terms of forecasting accuracy and computational efficiency. Notably, our framework showcases improved performance while requiring minimal additional computational overhead, thus presenting a promising avenue for accelerating the training process across diverse tasks and architectures.

artificial intelligence, machine learning, neural network, (13 more...)

arXiv:2505.02714

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tracking Dynamic Gaussian Density with a Theoretically Optimal Sliding Window Approach

Wang, Yinsong, Ding, Yu, Shahrampour, Shahin

arXiv.org Machine Learning, Mar-11-2024

Dynamic density estimation is ubiquitous in many applications, including computer vision and signal processing. One popular method to tackle this problem is the "sliding window" kernel density estimator. Various implementations of this method use heuristically defined weight sequences for the observed data. The weight sequence, however, is a key aspect of the estimator and significantly affects tracking performance. In this work, we study the exact mean integrated squared error (MISE) of "sliding window" Gaussian kernel density estimators for evolving Gaussian densities. We provide a principled guide for choosing the optimal weight sequence by theoretically characterizing the exact MISE, whose minimization can be formulated as a constrained quadratic program. We present empirical evidence with synthetic datasets to show that our weighting scheme indeed improves the tracking performance compared to heuristic approaches.
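As an illustration of where the weight sequence enters a "sliding window" kernel density estimator, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the fixed shared bandwidth, and the exponentially decaying weights (one example of a heuristic weight sequence) are all assumptions made for this example. The optimal weights described in the abstract would replace `exponential_weights`.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def exponential_weights(window_size, decay):
    """Heuristic weight sequence (assumed for illustration): newer batches
    get larger weight; the weights are normalized to sum to one."""
    raw = [decay ** (window_size - 1 - t) for t in range(window_size)]
    total = sum(raw)
    return [r / total for r in raw]


def sliding_window_kde(x, batches, weights, bandwidth):
    """Weighted mixture of per-batch Gaussian KDEs evaluated at point x.

    batches: list of lists of observations, oldest batch first.
    weights: one non-negative weight per batch, summing to one.
    bandwidth: kernel bandwidth h, shared by all batches (an assumption).
    """
    if len(batches) != len(weights):
        raise ValueError("need exactly one weight per batch")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    density = 0.0
    for w, batch in zip(weights, batches):
        # Ordinary KDE on this batch: (1 / (n h)) * sum_i K((x - x_i) / h)
        batch_density = sum(
            gaussian_kernel((x - xi) / bandwidth) for xi in batch
        ) / (len(batch) * bandwidth)
        density += w * batch_density
    return density


if __name__ == "__main__":
    window = [[-0.2, 0.1, 0.0], [0.4, 0.6, 0.5], [0.9, 1.1, 1.0]]
    w = exponential_weights(len(window), decay=0.5)
    print(sliding_window_kde(1.0, window, w, bandwidth=0.3))
```

Because the estimate is a convex combination of per-batch densities, any non-negative weight sequence that sums to one yields a valid density; the choice of weights only changes how quickly old batches are forgotten.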

kernel density estimator, sequence, weight sequence, (11 more...)

arXiv:2403.07207

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

TAKDE: Temporal Adaptive Kernel Density Estimator for Real-Time Dynamic Density Estimation

Wang, Yinsong, Ding, Yu, Shahrampour, Shahin

arXiv.org Machine Learning, Nov-8-2023

Real-time density estimation is ubiquitous in many applications, including computer vision and signal processing. Kernel density estimation is arguably one of the most commonly used density estimation techniques, and the use of a "sliding window" mechanism adapts kernel density estimators to dynamic processes. In this paper, we derive an upper bound on the asymptotic mean integrated squared error (AMISE) of the "sliding window" kernel density estimator. This upper bound provides a principled guide for devising a novel estimator, which we name the temporal adaptive kernel density estimator (TAKDE). Compared to heuristic approaches to the "sliding window" kernel density estimator, TAKDE is theoretically optimal in terms of the worst-case AMISE. We provide numerical experiments using synthetic and real-world datasets, showing that TAKDE outperforms other state-of-the-art dynamic density estimators (including those outside the kernel family). In particular, TAKDE achieves a superior test log-likelihood with a smaller runtime.

artificial intelligence, density estimator, estimator, (16 more...)

arXiv:2203.08317

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > California (0.14)
Asia > China > Hong Kong (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Multiwinner Approval Rules as Apportionment Methods

Brill, Markus (University of Oxford) | Laslier, Jean-Francois (Paris School of Economics) | Skowron, Piotr (University of Oxford)

AAAI Conferences, Feb-14-2017

We establish a link between multiwinner elections and apportionment problems by showing how approval-based multiwinner election rules can be interpreted as methods of apportionment. We consider several multi-winner rules and observe that some, but not all, of them induce apportionment methods that are well established in the literature and in the actual practice of proportional representation. For instance, we show that Proportional Approval Voting induces the D'Hondt method and that Monroe's rule induces the largest remainder method. We also consider properties of apportionment methods and exhibit multiwinner rules that induce apportionment methods satisfying these properties.
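The stated link between Proportional Approval Voting (PAV) and the D'Hondt method can be illustrated with a small sketch. It assumes the party-list setting, where every voter approves exactly the candidates of one party. In that setting, giving a party with v voters its (s+1)-th seat raises the PAV score by v/(s+1), which is the D'Hondt quotient. The function names, the integer vote counts, and the tie-breaking rule (first party in dictionary order) are assumptions of this example, not details taken from the paper.

```python
from fractions import Fraction


def dhondt(votes, seats):
    """Allocate seats by repeatedly awarding the highest quotient v / (s + 1).

    votes: dict mapping party name to an integer vote count.
    Ties go to the party listed first (an assumption of this sketch).
    """
    alloc = {party: 0 for party in votes}
    for _ in range(seats):
        winner = max(votes, key=lambda p: Fraction(votes[p], alloc[p] + 1))
        alloc[winner] += 1
    return alloc


def pav_score(votes, alloc):
    """PAV score of a seat allocation in the party-list setting: each voter
    of a party holding s seats contributes 1 + 1/2 + ... + 1/s."""
    return sum(
        votes[p] * sum(Fraction(1, j) for j in range(1, alloc[p] + 1))
        for p in votes
    )


if __name__ == "__main__":
    votes = {"A": 100000, "B": 80000, "C": 30000, "D": 20000}
    alloc = dhondt(votes, seats=8)
    print(alloc)
    print(float(pav_score(votes, alloc)))
```

Exact fractions are used so that quotient comparisons are not affected by floating-point rounding.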

apportionment method, artificial intelligence, owa-based rule, (13 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: Europe (0.93)

Industry: Government > Voting & Elections (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

Kar, Soummya, Moura, José M. F., Poor, H. Vincent

arXiv.org Machine Learning, Oct-24-2012

The paper considers a class of multi-agent Markov decision processes (MDPs), in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. The paper investigates a distributed reinforcement learning setup with no prior information on the global state transition and local agent cost statistics. Specifically, with the agents' objective consisting of minimizing a network-averaged infinite horizon discounted cost, the paper proposes a distributed version of $Q$-learning, $\mathcal{QD}$-learning, in which the network agents collaborate by means of local processing and mutual information exchange over a sparse (possibly stochastic) communication network to achieve the network goal. Assuming that each agent is aware only of its local online cost data and that the inter-agent communication network is weakly connected, the proposed distributed scheme is shown to yield, asymptotically and almost surely (a.s.), the desired value function and the optimal stationary control policy at each network agent. The analytical techniques developed in the paper to address the mixed time-scale stochastic dynamics of the consensus + innovations form, which arise from the proposed interactive distributed scheme, are of independent interest.
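To make the consensus + innovations structure concrete, the following is a minimal sketch of one synchronous update in plain Python. It is an illustrative reading of the abstract, not the paper's exact algorithm: the function name, the constant step sizes `alpha` and `beta`, the fixed neighbor lists, and the tabular nested-list layout are all assumptions. Each agent pulls its Q-value toward those of its neighbors (consensus) and corrects it with a temporal-difference term built from its own local cost (innovation). Costs are minimized, so the bootstrap term uses a minimum over actions.

```python
import copy


def qd_learning_step(q, neighbors, state, action, next_state,
                     costs, alpha, beta, gamma):
    """One consensus + innovations update for all agents (hypothetical sketch).

    q:         q[n][s][a] is agent n's Q-value for state s and action a.
    neighbors: neighbors[n] lists the agents that agent n hears from.
    costs:     costs[n] is agent n's locally observed one-stage cost.
    alpha:     innovation step size; beta: consensus step size.
    gamma:     discount factor.
    Returns a new table; the input table is left unchanged.
    """
    q_new = copy.deepcopy(q)
    for n in range(len(q)):
        q_n = q[n][state][action]
        # Consensus: accumulated disagreement with neighbors' estimates.
        consensus = sum(q_n - q[l][state][action] for l in neighbors[n])
        # Innovation: local temporal-difference error (cost minimization).
        innovation = costs[n] + gamma * min(q[n][next_state]) - q_n
        q_new[n][state][action] = q_n - beta * consensus + alpha * innovation
    return q_new


if __name__ == "__main__":
    # Two agents, two states, two actions; only one entry differs from zero.
    q = [[[1.0, 0.0], [0.0, 0.0]],
         [[3.0, 0.0], [0.0, 0.0]]]
    neighbors = {0: [1], 1: [0]}
    print(qd_learning_step(q, neighbors, state=0, action=0, next_state=1,
                           costs=[2.0, 4.0], alpha=0.5, beta=0.25, gamma=0.9))
```

Only the visited state-action pair is updated, and every agent sees the same global state and action while observing only its own cost, matching the setup described above.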

machine learning, reinforcement learning, state-action pair, (18 more...)

doi: 10.1109/TSP.2013.2241057

arXiv:1205.0047

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Sports > Soccer (0.67)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
